A. Background



In [1] a method was developed for solving separable nonconvex
optimization problems. The method can be viewed as an extension of the
Dantzig-Wolfe convex programming algorithm to nonconvex problems. The
extension involves embedding the Dantzig-Wolfe approximating linear
programs in a branch and bound algorithm. At each stage of the branch
and bound search a restricted master approximating linear program is solved
over a subset of the original feasible region. Dual variables from the
L.P. solution are used in (nonconvex) single variable Lagrangian
subproblem minimizations which price out new trial solutions for the master
L.P. The results of the Lagrangian subproblem solutions are

a) a global optimality test 

b) (perhaps) new columns for the master L.P. 

c) (perhaps) a branch in the branch and bound algorithm to further 
partition the feasible region. 

Details of the method are given in [1], and an extension to ε-optimality
is discussed in [2].

The purpose of the paper is to show that essentially the same method 
can be applied to nonseparable nonconvex optimization problems. The 
changes involve 

a) the Lagrangian subproblem minimizations no longer decompose into 
single variable minimizations. 

b) the branching rules for branch and bound are modified. The new
rules are a significant improvement and would probably be
advantageous in the separable case also.

The primary advantage of the original method, which is retained in
the extension of this paper, is that nonconvexity only needs to be directly
considered in the essentially unconstrained Lagrangian subproblems.
The efficiency of various methods for such subproblems has been considered
in [3].

The extension is "formal" in the sense that although all the
mathematical operations are valid and capable of routine computer
implementation, the convergence of the algorithm for arbitrary nonconvex
problems has not yet been demonstrated.

The report is intended to be read along with [1] although some material 
presented there has been restated here in nonseparated form for clarity. 



B. Preliminary Results 

We consider the general possibly nonconvex bounded nonlinear optimization 
problem 

    NLP    $\min f(x)$    (1)

           subject to  $g_i(x) \ge 0$,  $i = 1, \ldots, m$

                       $a_j \le x_j \le b_j$,  $j = 1, \ldots, n$.

Let

    $C = \{ x \mid a_j \le x_j \le b_j \ \forall j \}$    (2)



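Consider the linearization of NLP obtained by selecting vectors
$x_k \in C$, $k = 1, 2, \ldots, r$, and defining for each $x_k$ a convex
combination weight variable $\lambda_k \ge 0$ with
$\sum_{k=1}^{r} \lambda_k = 1$. Let $P_r$ be the restricted master
approximating linear program defined as

    $P_r$    $\min \sum_{k=1}^{r} \lambda_k f(x_k)$    (3)

             subject to  $\sum_{k=1}^{r} \lambda_k g_i(x_k) \ge 0$,  $i = 1, \ldots, m$

                         $\sum_{k=1}^{r} \lambda_k = 1$

                         $\lambda_k \ge 0$,  $k = 1, \ldots, r$.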
If $\lambda^*$ is the primal optimal solution to the linear program, then

    $x^* = \sum_{k=1}^{r} \lambda_k^* x_k$    (4)

defines an approximate solution point for the original NLP.
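
As a small illustration of (3) and (4), the following Python sketch forms
the point x* from a set of weights and grid points and evaluates the
objective of the master program. The function names and the one-variable
test function are assumptions made for this example only; they are not
taken from [1].

```python
def convex_combination(weights, points):
    """Return x* = sum_k lambda_k * x_k, as in (4)."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be nonnegative and sum to one")
    n = len(points[0])
    return [sum(w * p[j] for w, p in zip(weights, points)) for j in range(n)]


def master_objective(weights, points, f):
    """Return sum_k lambda_k * f(x_k), the objective of (3)."""
    return sum(w * f(p) for w, p in zip(weights, points))


# Hypothetical one-variable example with a concave (hence nonconvex) f.
f = lambda x: -x[0] ** 2
grid = [[0.0], [2.0]]
lam = [0.5, 0.5]
x_star = convex_combination(lam, grid)
z = master_objective(lam, grid, f)
```

With these data the linearized value z lies below f(x*); replacing f by
the convex function x^2 on the same grid puts it above f(x*).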

If NLP is a convex program, then the relationships between NLP
and $P_r$ are well understood. These relationships have been used to
develop a column generation algorithm which uses the dual variables from
$P_r$ in Lagrangian subproblems to iteratively find new vectors $x_k$ and
hence new columns for the restricted master $P_r$. This method, which can
be viewed as the nonlinear analogue of the Dantzig-Wolfe Decomposition
Principle, has been proved to converge in [4].

In the nonconvex case the relationships between $P_r$ and NLP are not as
simply understood. In particular, $P_r$ sometimes underestimates and
sometimes overestimates the original NLP. Nevertheless the $P_r$
linearization can be used in an algorithm for solving NLP.

Suppose that $\pi \in R^m$, $\sigma \in R$ are the dual variables for $P_r$, and
define the Lagrangian function for NLP

    $L(x, \pi) = f(x) - \sum_{i=1}^{m} \pi_i g_i(x)$.    (5)

The dual problem to $P_r$ is

    Dual    $\max \sigma$    (6)

            subject to  $\sigma + \sum_{i=1}^{m} \pi_i g_i(x_k) \le f(x_k)$,  $k = 1, \ldots, r$

                        $\pi \ge 0$,  $\sigma$ unrestricted

which can easily be rewritten as

    $\max_{\pi \ge 0} \ \min_{k = 1, \ldots, r} L(x_k, \pi)$.    (7)
If the inner minimization were over all $x \in C$ then (7) would be the
standard Lagrangian dual for NLP.
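
The inner minimum in (7) is taken over the finitely many current grid
points only, so it can be evaluated by direct enumeration. The sketch
below does this in Python, assuming constraints of the form g_i(x) >= 0
with nonnegative multipliers; the function names and the data are
hypothetical.

```python
def lagrangian(f, gs, x, pi):
    """L(x, pi) = f(x) - sum_i pi_i * g_i(x), as in (5)."""
    return f(x) - sum(p * g(x) for p, g in zip(pi, gs))


def restricted_dual_value(f, gs, points, pi):
    """Inner minimum of (7): min over the current grid points of L(x_k, pi)."""
    return min(lagrangian(f, gs, x, pi) for x in points)


# Hypothetical data: f(x) = x^2 with the single constraint g(x) = x - 1 >= 0.
f = lambda x: x[0] ** 2
gs = [lambda x: x[0] - 1.0]
grid = [[0.0], [1.0], [2.0]]
value = restricted_dual_value(f, gs, grid, pi=[2.0])
```

For these data the Lagrangian takes the values 2, 1 and 2 on the three
grid points, so the restricted dual function equals 1 at pi = 2.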

Lemma 1 (Lower bound)

$x^*$ feasible for NLP, $\pi \ge 0$ $\Rightarrow$

    $\min_{x \in C} L(x, \pi) \le f(x^*)$.    (8)

Proof

    $\min_{x \in C} L(x, \pi) \le L(x^*, \pi) = f(x^*) - \sum_{i} \pi_i g_i(x^*) \le f(x^*)$

since $\pi_i \ge 0$ and $g_i(x^*) \ge 0$.
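
Lemma 1 can be checked numerically on a one-variable example. The Python
sketch below replaces the global minimization over C by a scan of a
uniform grid, which is only a crude stand-in for a genuine global method,
and the objective and constraint are invented for the illustration.

```python
import math


def lagrangian_min_on_grid(f, g, pi, a, b, steps=400):
    """Scan a uniform grid on C = [a, b] for the smallest value of L(x, pi)."""
    xs = [a + (b - a) * k / steps for k in range(steps + 1)]
    return min(f(x) - pi * g(x) for x in xs), xs


f = lambda x: math.sin(3.0 * x) + 0.1 * x * x   # nonconvex objective (assumed)
g = lambda x: x - 0.5                            # x is feasible when g(x) >= 0

for pi in (0.0, 0.5, 2.0):
    bound, xs = lagrangian_min_on_grid(f, g, pi, -2.0, 2.0)
    best_feasible = min(f(x) for x in xs if g(x) >= 0.0)
    assert bound <= best_feasible   # inequality (8), restricted to the grid
```

Because every feasible grid point is also a candidate in the Lagrangian
scan, the inequality holds on the grid for each nonnegative multiplier.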

Suppose we solve $P_r$ obtaining optimal primal variables $\lambda$ and
optimal dual variables $\pi$, $\sigma$ with objective function value $Z$ $(= \sigma)$.
Let $\hat{x}$ globally solve the nonconvex Lagrangian subproblem

    $\min_{x \in C} L(x, \pi)$,    (9)

and let $x^* = \sum_{k} \lambda_k x_k$ as in (4).

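Theorem 1 (Optimality test)

If a) $f(x^*) \le Z$

b) $g_i(x^*) \ge 0$ for all $i$

c) $L(\hat{x}, \pi) \ge \sigma$

then $x^*$ is globally optimal for NLP.

Proof

    $Z = \sigma \le L(\hat{x}, \pi) \le \min \{ f(x) \mid x \text{ feasible for NLP} \} \le f(x^*) \le Z$

The first inequality is c), the second is Lemma 1, the third holds
because b) makes $x^*$ feasible, and the last is a).
Thus $x^*$ solves NLP.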

A version of this theorem which allows for tolerances in conditions
a), b), c), and the Lagrangian minimization, and which implies ε-global
optimality, can be easily developed as in [2].
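
The three conditions of Theorem 1 are simple comparisons once the master
L.P. and the Lagrangian subproblem have been solved. The Python sketch
below evaluates them with a single absolute tolerance; this tolerance is
a placeholder chosen for the example and is not the ε-optimality
construction of [2]. The constraint convention g_i(x*) >= 0 is assumed,
and all names are hypothetical.

```python
def optimality_test(f_star, g_star, lagrangian_min, z, sigma, tol=1e-8):
    """Check conditions a), b), c) of Theorem 1 up to a tolerance `tol`.

    f_star, g_star : f(x*) and the list of values g_i(x*)
    lagrangian_min : L(x_hat, pi) from the global Lagrangian minimization
    z, sigma       : optimal value of the master L.P. and the dual variable
                     of its convexity row (equal at an L.P. optimum)
    """
    a = f_star <= z + tol
    b = all(gi >= -tol for gi in g_star)
    c = lagrangian_min >= sigma - tol
    return {"a": a, "b": b, "c": c, "optimal": a and b and c}


# Hypothetical values: all three conditions hold, so x* would be accepted.
result = optimality_test(f_star=1.0, g_star=[0.0, 0.3],
                         lagrangian_min=1.0, z=1.0, sigma=1.0)
```

Returning the three flags separately keeps visible which condition
failed, and that is what decides whether to add a column or to branch.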






The primary value of Theorem 1 is that if optimality is not achieved,
then it suggests further actions for the optimization algorithm. In
particular, if c) is violated, then the vector $\hat{x}$ generates a column
$[f(\hat{x}), g(\hat{x}), 1]$ with negative reduced cost in the simplex tableau for
$P_r$. Thus $\hat{x}$ should be incorporated as a new $x_k$ point. If a) or
b) is violated, then the NLP is not convex at the current convex
combination $x^*$. In this case the algorithm must resort to branch and bound
to enforce a different convex combination which (hopefully) does not violate
convexity so badly. In [1] the choice of variable on which to branch was
simple and depended on the separated components $f_j$, $g_{ij}$ of the separable
functions $f$ and $g_i$, where for example,

    $f(x) = \sum_{j=1}^{n} f_j(x_j)$    (10)

The major point of the paper is that reasonable selection rules can be 
developed even in nonseparable cases. Before developing these rules we 
state the entire algorithm for the nonseparable case in detail. 

C. The Algorithm 
Step 1 Initialization 

Choose an initial set of vectors $x_k \in C$. Let $P_r^t$ with $t = 1$
(= subproblem counter) be the $P_r$ program corresponding to this initial
set. Let $C_j^t = [a_j, b_j]$. Let $L^t = -\infty$ be the current largest lower
bound for $P_r^t$. Let $F^0 = +\infty$ be the value of $f(x)$ for the best
incumbent feasible solution to NLP found so far. Place $P_r^t$ on a list of
subproblems and go to step 2.

Step 2 Linear Program 

If the list of subproblems is empty, stop; the incumbent solution is
globally optimal. Otherwise select a problem $P_r^t$ from the list (see
discussion in section D.) and solve it, yielding optimal value $Z$ with
optimal primal variables $\lambda$ and optimal dual variables $\pi$, $\sigma$.
(If $P_r^t$ is infeasible then the method is slightly modified. See [1] for
details.) Go to step 3.

Step 3 Lagrangian Minimization

Solve the nonconvex Lagrangian problem $\min_{x \in C^t} L(x, \pi)$ giving
solution $\hat{x}$. Let $B = L(\hat{x}, \pi)$. If $B \ge F^0$ then fathom and go
to step 2. If $F^0 > B > L^t$ then increase the value of the bound for
$P_r^t$ to $L^t = B$ and go to step 4. Otherwise go to step 4 without
changing the bound.

Step 4 New Grid Points 

If $L(\hat{x}, \pi) < \sigma$ then use $\hat{x}$ to generate a new column for $P_r^t$.
Place the new $P_r^t$ on the list and go to step 2.

If $L(\hat{x}, \pi) \ge \sigma$ then go to step 5.

Step 5 Optimality Test 

Compute $x^*$ from $\lambda$ using (4). If $g_i(x^*) \ge 0$, $i = 1, \ldots, m$, and
if $f(x^*) < F^0$ then replace $F^0$ with $f(x^*)$ and let $x^*$ be the new
incumbent solution.

If a) $f(x^*) \le Z$ and b) $g_i(x^*) \ge 0$ $\forall i$, then $x^*$ is globally
optimal for the NLP subproblem over $x \in C^t$. Go to step 2.

If a) or b) is violated, go to step 6.

Step 6 Branch 

Use $x^*$ to generate a new column (nonbasic) for $P_r^t$. Select a
coordinate $x_j^*$, $j = 1, \ldots, n$, of $x^*$ (see discussion in section D.) and
branch creating two new subproblems

a) restricted to $x_j \le x_j^*$ (include only those columns for
which $(x_k)_j \le x_j^*$) and

b) restricted to $x_j \ge x_j^*$ (include only those columns for
which $(x_k)_j \ge x_j^*$).

Compute a bound for each of these subproblems and place both on
the list. Go to step 2.
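
The decisions taken in steps 3, 4 and 6 can be sketched in Python as
follows. This is only an outline under stated assumptions: constraints
are taken in the form g_i(x) >= 0, a subproblem is represented by a list
of grid points and a list of coordinate intervals, and every function
name is hypothetical. The L.P. solution and the global Lagrangian
minimization are assumed to be supplied by other routines.

```python
def bound_step(B, incumbent, lower_bound):
    """Step 3: fathom if B >= F0, otherwise raise the bound when B improves it."""
    if B >= incumbent:
        return "fathom", lower_bound
    return "continue", max(lower_bound, B)


def price_column(f, gs, x_hat, pi, sigma):
    """Step 4: column [f, g_1, ..., g_m, 1] for x_hat and its reduced cost.

    The reduced cost equals L(x_hat, pi) - sigma, so it is negative exactly
    when x_hat should enter the master L.P.
    """
    column = [f(x_hat)] + [g(x_hat) for g in gs] + [1.0]
    reduced_cost = column[0] - sum(p * c for p, c in zip(pi, column[1:-1])) - sigma
    return column, reduced_cost


def branch(points, box, x_star, j):
    """Step 6: add x* as a grid point and split on coordinate j at x*_j."""
    pts = points + [x_star]
    lower_box = [list(interval) for interval in box]
    upper_box = [list(interval) for interval in box]
    lower_box[j][1] = x_star[j]      # subproblem a): x_j <= x*_j
    upper_box[j][0] = x_star[j]      # subproblem b): x_j >= x*_j
    lower_pts = [p for p in pts if p[j] <= x_star[j]]
    upper_pts = [p for p in pts if p[j] >= x_star[j]]
    return (lower_box, lower_pts), (upper_box, upper_pts)


# Hypothetical two-variable subproblem split on its first coordinate.
sub_a, sub_b = branch([[0.0, 0.0], [2.0, 1.0]], [(0.0, 2.0), (0.0, 1.0)],
                      [1.0, 0.5], j=0)
```

Since x* satisfies both branching inequalities with equality in the
chosen coordinate, its column is retained in both new subproblems.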



D. Branch and Problem Selection Rules 

Two aspects of the method remain to be described: the rule for
selecting the next subproblem $P_r^t$ to examine in step 2 and the rule
for selecting the component $x_j^*$ of $x^*$ on which to branch in step 6.
In this section we show that penalty calculations can be used for these
decisions. The penalties used were originally considered in the context
of integer programming and have been applied to separable nonconvex
optimization using special ordered sets in [5]. In fact, neither
separability nor the ordered set property is necessary, as we shall show.
Consider the following linear program



    $\min cx$

    s.t.  $Ax = b$    (11)

          $x \ge 0$

with optimal basis $B$ and optimal solution

    $x_B = B^{-1} b$ ;  $x_N = 0$    (12)

Suppose we have available the optimal simplex tableau containing 
the transformed constraint matrix 

    $T = B^{-1} A$    (13)

and also the row of reduced cost coefficients 
    $\bar{c} = c - c_B B^{-1} A$    (14)

A standard result of post optimality analysis is that if we force a basic
variable $x_{B_i}$ to zero, and let the remaining variables adjust optimally,
then a first order approximation to the resulting objective function
change is

    $Q_i = x_{B_i} \left( \min_{j \text{ nonbasic},\ t_{ij} > 0} \bar{c}_j / t_{ij} \right)$    (15)

This approximation is called the "penalty" for forcing $x_{B_i}$ to zero.
It gives the exact change if no basis changes occur before $x_{B_i}$ reaches
zero, and is otherwise an under-estimate.
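
The penalty (15) uses only one row of the optimal tableau and the row of
reduced costs. The Python sketch below computes it; the argument names
and the sample row are hypothetical, and returning an infinite penalty
when no tableau entry is positive is an assumption of this example rather
than a statement from the text.

```python
def penalty(x_basic_i, row_i, reduced_costs, nonbasic):
    """Penalty (15) for driving the basic variable in tableau row i to zero.

    row_i[j] is t_ij, reduced_costs[j] is the reduced cost of column j,
    and `nonbasic` lists the nonbasic column indices.
    """
    ratios = [reduced_costs[j] / row_i[j] for j in nonbasic if row_i[j] > 0]
    if not ratios:
        # Assumption: with no positive t_ij there is no admissible pivot,
        # so the penalty is reported as infinite.
        return float("inf")
    return x_basic_i * min(ratios)


# Hypothetical tableau row: columns 0 and 1 are basic, columns 2 to 4 nonbasic.
q = penalty(2.0, [1.0, 0.0, 0.5, -1.0, 4.0], [0.0, 0.0, 1.0, 3.0, 2.0], [2, 3, 4])
```

In this example the admissible ratios are 1/0.5 = 2 and 2/4 = 0.5, so the
penalty is 2 times 0.5, that is 1.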

In the context of branch selection for $P_r^t$ in step 6 of the algorithm
we wish to compute a penalty for each component $x_j$ for the two resulting
subproblems. Each subproblem involves dropping several of the current
vectors $x_k$, or equivalently forcing the corresponding variables $\lambda_k$
to zero.

Suppose S is a set of variables which we want to force to zero 
if basic or maintain at zero if nonbasic. Then the penalty for this 
action is 



    $Q_S = \max_{x_{B_i} \in S} \ x_{B_i} \left[ \min_{j \text{ nonbasic},\ x_j \notin S,\ t_{ij} > 0} \bar{c}_j / t_{ij} \right]$    (16)
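
A set penalty of the form (16) can be computed by repeating the single
variable calculation for each basic variable in S while excluding the
members of S from the candidate entering columns. The Python sketch below
assumes that the tableau is stored as a list of rows; the data layout,
the names, and the value zero returned when S contains no basic variable
are all assumptions of this example.

```python
def set_penalty(S, basis, x_basic, T, cbar):
    """Penalty (16) for holding every variable with index in S at zero.

    basis[i]   : index of the variable that is basic in row i
    x_basic[i] : its current value
    T[i][j]    : transformed constraint matrix (13)
    cbar[j]    : reduced costs (14)
    """
    basic = set(basis)
    allowed = [j for j in range(len(cbar)) if j not in basic and j not in S]
    worst = 0.0  # assumed value when no basic variable belongs to S
    for i, var in enumerate(basis):
        if var not in S:
            continue
        ratios = [cbar[j] / T[i][j] for j in allowed if T[i][j] > 0]
        q_i = x_basic[i] * min(ratios) if ratios else float("inf")
        worst = max(worst, q_i)
    return worst


# Hypothetical optimal tableau with basic variables 0 and 1.
basis = [0, 1]
x_basic = [2.0, 3.0]
T = [[1.0, 0.0, 0.5, -1.0, 4.0],
     [0.0, 1.0, 1.0, 2.0, -1.0]]
cbar = [0.0, 0.0, 1.0, 3.0, 2.0]
q_set = set_penalty({0, 4}, basis, x_basic, T, cbar)
```

Here variable 0 alone would have penalty 1, but once column 4 is also
held at zero the cheapest remaining pivot gives the larger value 4.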






In the context of $P_r^t$, then, the appropriate penalties are obtained from
(16) by setting $S$ to be

    $S_j^+ = \{ \lambda_k \mid (x_k)_j < x_j^* \}$    (17)

for subproblem b) of step 6 and

    $S_j^- = \{ \lambda_k \mid (x_k)_j > x_j^* \}$    (18)

for subproblem a), for each $j = 1, \ldots, n$.

Intuitively a subproblem with a large penalty is unlikely to contain
the global optimal solution to NLP. Thus branch selection in step 6 can
be performed so that one of the two resulting subproblems is most likely
to contain the solution by selecting $j$ to satisfy

    $\min_j Q_j^{\pm}$    (19)

or possibly

    $\min_j | Q_j^+ - Q_j^- |$    (20)

where $Q_j^+$ and $Q_j^-$ are the penalties for subproblems b) and a)
of step 6 respectively. In either case the choice of the next $P_r^t$ to
work on in step 2 can then be the highly likely subproblem, placing the
unlikely candidate on the list and hoping that it will be fathomed before
it has to be solved.
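
Given the two penalties for every coordinate, the selection rules reduce
to a search over j. The Python sketch below is one possible reading: the
first rule is interpreted as choosing the coordinate with the smallest
single penalty over both subproblems, which is an assumption, and the
second uses the absolute difference as in (20). The names, the rule
labels, and the association of the "plus" penalty with subproblem b) are
likewise assumptions of this example.

```python
def select_branch(q_plus, q_minus, rule="smallest"):
    """Pick the branching coordinate j and the subproblem to examine first.

    q_plus[j], q_minus[j] : penalties for subproblems b) and a) when
                            branching on coordinate j
    rule="smallest"   : smallest single penalty (one reading of (19))
    rule="difference" : smallest |Q+ - Q-|, as in (20)
    """
    if rule == "smallest":
        key = lambda j: min(q_plus[j], q_minus[j])
    elif rule == "difference":
        key = lambda j: abs(q_plus[j] - q_minus[j])
    else:
        raise ValueError("unknown rule")
    j = min(range(len(q_plus)), key=key)
    first = "b" if q_plus[j] <= q_minus[j] else "a"  # smaller penalty first
    return j, first


# Hypothetical penalties for three coordinates.
choice = select_branch([4.0, 0.5, 3.0], [1.0, 2.5, 3.2])
```

The subproblem with the larger penalty would be placed on the list, in
the hope that it is fathomed before it has to be solved.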

It should be emphasized that in the nonconvex case the penalties
yield only a guide and not guaranteed bounds on the new subproblems.
Thus they should not be used to infer new bounds on the resulting
subproblems after a branch.

As in all branch and bound procedures the actual choices of subproblem
and branch selection rules should be governed by computational
experience.